AITopics | text sequence

Large language models (LLMs) process and predict sequences containing text to answer questions, and address tasks including document summarization, providing recommendations, writing software and solving quantitative problems. We provide a mathematical framework for LLMs by describing the encoding of text sequences into sequences of tokens, defining the architecture for next-token prediction models, explaining how these models are learned from data, and demonstrating how they are deployed to address a variety of tasks. The mathematical sophistication required to understand this material is not high, and relies on straightforward ideas from information theory, probability and optimization. Nonetheless, the combination of ideas resting on these different components from the mathematical sciences yields a complex algorithmic structure; and this algorithmic structure has demonstrated remarkable empirical successes. The mathematical framework established here provides a platform from which it is possible to formulate and address questions concerning the accuracy, efficiency and robustness of the algorithms that constitute LLMs. The framework also suggests directions for development of modified and new methodologies.

large language model, machine learning, natural language, (18 more...)

arXiv.org Machine Learning

2601.2217

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech

Neural Information Processing SystemsDec-24-2025, 04:23:36 GMT

Polyphone disambiguation aims to capture accurate pronunciation knowledge from natural text sequences for reliable Text-to-speech (TTS) systems. However, previous approaches require substantial annotated training data and additional efforts from language experts, making it difficult to extend high-quality neural TTS systems to out-of-domain daily conversations and countless languages worldwide. This paper tackles the polyphone disambiguation problem from a concise and novel perspective: we propose Dict-TTS, a semantic-aware generative text-to-speech model with an online website dictionary (the existing prior information in the natural language). Specifically, we design a semantics-to-pronunciation attention (S2PA) module to match the semantic patterns between the input text sequence and the prior semantics in the dictionary and obtain the corresponding pronunciations; The S2PA module can be easily trained with the end-to-end TTS model without any annotated phoneme labels. Experimental results in three languages show that our model outperforms several strong baseline models in terms of pronunciation accuracy and improves the prosody modeling of TTS systems. Further extensive analyses demonstrate that each design in Dict-TTS is effective.

dict-tts, dictionary knowledge, name change, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

4d4a3b6a34332d80349137bcc98164a5-Supplemental-Conference.pdf

Neural Information Processing SystemsOct-8-2025, 15:51:56 GMT

machine learning, natural language, section 4, (18 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Preference-grounded Token-level Guidance for Language Model Fine-tuning Shentao Yang

Neural Information Processing SystemsOct-8-2025, 15:51:52 GMT

Aligning language models (LMs) with preferences is an important problem in natural language generation.

arxiv preprint arxiv, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Michigan > Washtenaw County > Ann Arbor (0.04)
(3 more...)

Genre:

Workflow (0.67)
Research Report (0.46)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.69)

Add feedback

c2c2a04512b35d13102459f8784f1a2d-Paper.pdf

Neural Information Processing SystemsAug-22-2025, 01:07:32 GMT

coco-lm, sequence, transformer, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Illinois (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Insight Rumors: A Novel Textual Rumor Locating and Marking Model Leveraging Att_BiMamba2 Network

Ma, Bin, Zhang, Yifei, Xian, Yongjin, Li, Qi, Zhou, Linna, Miao, Gongxun

arXiv.org Artificial IntelligenceAug-19-2025

With the development of social media networks, rumor detection models have attracted more and more attention. Whereas, these models primarily focus on classifying contexts as rumors or not, lacking the capability to locate and mark specific rumor content. To address this limitation, this paper proposes a novel rumor detection model named Insight Rumors to locate and mark rumor content within textual data. Specifically, we propose the Bidirectional Mamba2 Network with Dot-Product Attention (Att_BiMamba2), a network that constructs a bidirectional Mamba2 model and applies dot-product attention to weight and combine the outputs from both directions, thereby enhancing the representation of high-dimensional rumor features. Simultaneously, a Rumor Locating and Marking module is designed to locate and mark rumors. The module constructs a skip-connection network to project high-dimensional rumor features onto low-dimensional label features. Moreover, Conditional Random Fields (CRF) is employed to impose strong constraints on the output label features, ensuring accurate rumor content location. Additionally, a labeled dataset for rumor locating and marking is constructed, with the effectiveness of the proposed model is evaluated through comprehensive experiments. Extensive experiments indicate that the proposed scheme not only detects rumors accurately but also locates and marks them in context precisely, outperforming state-of-the-art schemes that can only discriminate rumors roughly.

artificial intelligence, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2508.12574

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

Add feedback

Incorporating Legal Logic into Deep Learning: An Intelligent Approach to Probation Prediction

Wang, Qinghua, Zhang, Xu, Yang, Lingyan, Shao, Rui, Wang, Bonan, Wang, Fang, Qu, Cunquan

arXiv.org Artificial IntelligenceAug-19-2025

Probation is a crucial institution in modern criminal law, embodying the principles of fairness and justice while contributing to the harmonious development of society. Despite its importance, the current Intelligent Judicial Assistant System (IJAS) lacks dedicated methods for probation prediction, and research on the underlying factors influencing probation eligibility remains limited. In addition, probation eligibility requires a comprehensive analysis of both criminal circumstances and remorse. Much of the existing research in IJAS relies primarily on data-driven methodologies, which often overlooks the legal logic underpinning judicial decision-making. To address this gap, we propose a novel approach that integrates legal logic into deep learning models for probation prediction, implemented in three distinct stages. First, we construct a specialized probation dataset that includes fact descriptions and probation legal elements (PLEs). Second, we design a distinct probation prediction model named the Multi-Task Dual-Theory Probation Prediction Model (MT-DT), which is grounded in the legal logic of probation and the \textit{Dual-Track Theory of Punishment}. Finally, our experiments on the probation dataset demonstrate that the MT-DT model outperforms baseline models, and an analysis of the underlying legal logic further validates the effectiveness of the proposed approach.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Artificial Intelligence

2508.12286

Country:

Asia (0.68)
North America > United States > Minnesota (0.28)

Genre: Research Report (1.00)

Industry:

Law > Criminal Law (1.00)
Law Enforcement & Public Safety > Crime Prevention & Enforcement (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Filters

Collaborating Authors

text sequence

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

Appendix for " Preference-grounded Token-level Guidance for 657 Language Model Fine-tuning " 658 Table of Contents

4d4a3b6a34332d80349137bcc98164a5-Paper-Conference.pdf

c2c2a04512b35d13102459f8784f1a2d-Paper.pdf

Large Language Models: A Mathematical Formulation

Dict-TTS: Learning to Pronounce with Prior Dictionary Knowledge for Text-to-Speech

4d4a3b6a34332d80349137bcc98164a5-Supplemental-Conference.pdf

Preference-grounded Token-level Guidance for Language Model Fine-tuning Shentao Yang

c2c2a04512b35d13102459f8784f1a2d-Paper.pdf

Insight Rumors: A Novel Textual Rumor Locating and Marking Model Leveraging Att_BiMamba2 Network

Incorporating Legal Logic into Deep Learning: An Intelligent Approach to Probation Prediction